07. Attention Overview

True or False: A sequence-to-sequence model processes the input sequence all in one step

SOLUTION: False - a seq2seq model feeds one element of the input sequence to the encoder at a time, updating the encoder's hidden state at each step
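
As a minimal sketch of this step-by-step behavior (assuming a plain Elman-style RNN encoder; the weight names, sizes, and random inputs below are illustrative, not from the original):

```python
import numpy as np

# Illustrative sketch of a recurrent encoder (plain Elman-style RNN);
# the weight names and sizes are assumptions for the example only.
hidden_size, embed_size = 8, 4
rng = np.random.default_rng(0)
W_xh = rng.normal(size=(embed_size, hidden_size))   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_size, hidden_size))  # hidden-to-hidden weights

def encode(input_seq):
    """Feed one element of the input sequence at a time to the encoder."""
    h = np.zeros(hidden_size)
    for x_t in input_seq:                    # one time step per input element
        h = np.tanh(x_t @ W_xh + h @ W_hh)   # update hidden state sequentially
    return h                                 # final state: the context vector

context = encode(rng.normal(size=(5, embed_size)))  # toy 5-token input
print(context.shape)  # (8,) -- fixed size, regardless of input length
```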

Which of the following limitations of seq2seq models can be addressed using attention?

SOLUTION:
  • The fixed size of the context vector passed from the encoder to the decoder is a bottleneck
  • Difficulty encoding long sequences and recalling long-term dependencies (see the attention sketch after this list)
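
A hedged sketch of how attention relieves both points: rather than relying on one fixed-size summary, the decoder scores every encoder hidden state and takes a weighted sum. The dot-product scoring and all shapes here are illustrative assumptions:

```python
import numpy as np

# Illustrative dot-product attention over all encoder hidden states;
# shapes and random values are placeholder assumptions.
rng = np.random.default_rng(0)
seq_len, hidden_size = 5, 8
encoder_states = rng.normal(size=(seq_len, hidden_size))  # one state per token
decoder_state = rng.normal(size=hidden_size)              # current decoder state

scores = encoder_states @ decoder_state      # similarity with each encoder step
weights = np.exp(scores - scores.max())
weights /= weights.sum()                     # softmax over input positions
context = weights @ encoder_states           # weighted sum of encoder states
print(weights.round(3))                      # which input positions were used
print(context.shape)                         # (8,) context for this decode step
```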

How large is the context matrix in an attention seq2seq model?

SOLUTION: It depends on the length of the input sequence - the decoder is given one encoder hidden state per input token, so the context grows with the input rather than having a fixed size
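
A small illustrative check of that answer, assuming the context passed to an attentive decoder is the full stack of encoder hidden states (sizes are placeholders):

```python
import numpy as np

# The context handed to an attentive decoder is the full stack of encoder
# hidden states, so its size tracks the input length (sizes illustrative).
hidden_size = 8
for seq_len in (3, 10, 50):
    context_matrix = np.zeros((seq_len, hidden_size))  # one row per input token
    print(seq_len, context_matrix.shape)  # (3, 8), (10, 8), (50, 8)
```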